Over the past decades, brain-computer interfaces (BCIs) have gained considerable
attention in fields ranging from medicine to entertainment, and
electroencephalogram (EEG) signals are widely used in BCIs. A BCI makes
human-computer interaction possible by using information acquired from a
person’s EEG signals. The raw EEG signals need to be processed to obtain
information that can be used for communication. The objective of this paper
is to identify the combination of features that best discriminates cognitive
stimuli-based tasks. EEG signals are recorded while the subjects perform
arithmetic mental tasks. Statistical, power, entropy, and fractional
dimension (FD) features are extracted from the EEG signals. Various
combinations of these features are analyzed and validated using a random
forest classifier, K-nearest neighbors (KNN), a multilayer perceptron, linear
discriminant analysis, and a support vector machine. The combination of
entropy and FD features gives the highest accuracy of 90.47% with the KNN
algorithm. By comparison, entropy features alone reach 79.36% (with both the
random forest classifier and the multilayer perceptron), and FD features
alone reach 82.53% (with linear discriminant analysis). Our results show that
hybrid entropy-FD features with a KNN classifier can efficiently classify
cognition-based stimuli.
1. INTRODUCTION 

Brain-computer interface (BCI) acts as an artificial and alternative output channel for the brain 
which is similar to the normal output channels like muscles and peripheral nerves. Hence, BCI is defined as 
“a brain-computer interface is a communication system that does not depend on the brain’s normal output 
pathways of peripheral nerves and muscles” [1]. BCI requires two adaptive controllers: A brain from where 
the electrical activity is recorded and a system that converts this electrical activity into control commands. 
Gain [2] discussed the function of various lobes of the brain in human behaviors. The temporal lobe is 
responsible for language processing, the occipital lobe for visual processing, the parietal lobe for sensations, 
and the frontal lobe for cognition and emotions. BCI has an input, an output, and a translation algorithm. The input is
the set of features of the signals recorded from the brain, for example time-domain and frequency-domain
features. The translation algorithm, such as a linear or nonlinear equation or a neural network, converts these
input features into control signals. These output control signals are used to control or operate a device [3].
Event-related potentials are responses from the brain under certain conditions like giving external stimuli. 
There are two types of event related potentials (ERPs): exogenous and endogenous. Exogenous ERPs are 
responses generated by the brain spontaneously as a result of an external stimulus, regardless of the subject’s
thinking or actions. Endogenous ERPs are responses that are generated while the subjects try to respond to
external stimuli through thinking, imagination, or emotions, for example while solving a given mathematical
problem. These kinds of ERPs are also called cognitive ERPs [4]. Most BCI applications consist of the
following steps: preprocessing, channel selection, feature extraction, feature optimization, and classification
[5]. The main challenge while dealing with electroencephalogram (EEG) signals is extracting appropriate
features because of the non-stationary property of EEG signals and the number of channels used. Since the 
EEG signals are non-stationary and recorded in the time domain, it is necessary to analyze the EEG
data in multiple domains, which gives richer time-related and frequency-related
information about the recorded signals. The objective of this paper is to give an overview of existing
feature extraction techniques, to extract features from the time as well as the frequency domain, and to
analyze the impact of various combinations of features on the classification accuracy in order to find out which
combination performs better on cognitive stimuli. The categories of features extracted in this study are
statistical features (S), power features (P), entropy features (E), and fractional dimension (F) features. Two 
methods are employed to analyze the performance of the classifiers based on the features: i) combine each 
feature with respect to their categories and compare the accuracies of each category and ii) combine each 
category of features in multiple combinations and compare the accuracies of each combination of feature 
categories. 

In this study, we use various classifiers such as random forest classifier (RFC), K-nearest neighbors 
(KNN), multilayer perceptron (MLP), linear discriminant analysis (LDA), and support vector machine 
(SVM) to classify EEG data into thirteen different classes of mental tasks (i.e., thirteen stimuli). Our results 
show that the combination of entropy-FD features employed in method (ii) with KNN gives the highest 
accuracy of 90.47%. 


2. RELATED WORKS 

Feature extraction methods are necessary to get the salient features from time-domain EEG signals 
which effectively classify the data. The features that can be obtained from the preprocessed signals belong to 
time-domain, frequency-domain, time-frequency domain, and spatial domain. 


2.1. Time-domain features 

In the time domain, power is analyzed with respect to time. Here, event-related potentials invoked 
by external stimuli act as a command. Examples of time-domain features include P300 potentials and slow cortical
potentials [1]. Choi et al. [6] examined the brain responses across different regions while classifying
mathematical and baseline tasks, which is an endogenous paradigm, i.e., without external stimuli. Ear-EEG is
used for recording these self-modulated signals from the brain while performing mathematical tasks. Here 
among the statistical features, mean, standard deviation, mean absolute value (MAV) of the first and second 
difference of raw and standardized signals are widely used. Nawaz et al. [7] proposed that time domain-based 
statistical features and SVM with RBF kernel give better accuracy when compared to power, wavelet, FD, 
entropy features. However, separating noise from the signal is considered a challenging task with time- 
domain features alone [8]. 


2.2. Frequency domain features 

In the frequency domain, power is analyzed with respect to frequency. Here, the amplitude of 
frequency sub-bands acts as a command; examples of frequency-domain features include rhythms such as α and β [1].
Power-based feature extraction deals only with frequency sub-bands, which makes it a frequency-domain feature
extraction technique [9]. The fast Fourier transform (FFT) is used for spectral analysis of a signal that is stationary,
which makes it unsuitable for EEG signals. It computes the discrete Fourier transform of the signal to find its
frequency content [10]. Wang et al. [11] utilized frequency-domain features using the canonical correlation analysis
(CCA) algorithm and power spectral density (PSD) techniques for further optimization. The stimulus frequency
was identified using these techniques. Frequency-domain feature extraction is widely suggested, and
PSD is a widely used technique for extracting frequency-domain features. According to the Wiener-Khintchine
theorem, PSD is calculated by applying the Fourier transform to the autocorrelation function $R_{xx}(m)$
[12], or equivalently, PSD is calculated by taking the average of the squared magnitude of the Fourier
transform [13]. Akrami et al. [14] proposed that logarithmic PSD is considered as the best method suitable
for recognizing patterns from EEG. Shen et al. [15] proposed a method called WPT-BED to classify the 
cognitive tasks based on judgment where wavelet packet transform with db4 wavelet is used to decompose 
the signal into frequency bands and bispectrum features are extracted from the decomposed frequency bands. 
Then the sub-bands are reconstructed and bispectral eigenvalues of differential signals (BED) are used to 
optimize bispectral features from the resultant time-domain signal. The optimized features are then classified 
using SVM. PSD might ignore certain frequency features such as phase, which is very important in 
processing EEG signals. BED features improve classification accuracy by considering an ample amount of 
information that is not considered in PSD. 


2.3. Time-frequency domain features 

Wavelets are suitable for analyzing non-stationary signals like EEG. Wavelet-based techniques deal
with both temporal and frequency ranges, which makes them both time- and frequency-domain feature extraction
techniques [9]. The various types of wavelet transforms are the discrete wavelet transform (DWT), continuous
wavelet transform (CWT), tunable Q-factor wavelet transform (TQWT), and dual tree-complex wavelet
transform (DT-CWT), which are used for splitting signals into various frequency sub-bands (signal
decomposition). The main drawback of FFT is that it extracts the frequency features by taking the average over
the entire signal without considering the difference in the time domain, which makes FFT only suitable for 
extracting frequency-related features from signals which are stationary in the time domain. Since EEG is
non-stationary, the short-time Fourier transform (STFT) is used for representing time-frequency features of the
signal. The idea behind STFT is that the entire signal is divided into segments short enough to be treated as
stationary, and FFT is applied to each segment. Hence, it provides frequency-related information with respect
to time interval [11]. STFT involves applying windows to the raw signals, and the FFT is applied to the
resultant signals [10]. The major drawback of STFT is that the window size is fixed, which limits its
capability to distinguish among various features and provides limited information regarding the location of 
frequency changes. On the other hand, wavelet transform or decomposition represents the features in a time- 
frequency domain called scalograms by decomposing the signals into various sub bands. Wavelet 
transform/decomposition helps to find the location of frequency changes in each sub band [8]. The drawback 
of STFT could be overcome by a CWT. In CWT, the window size can be changed based on the spectral 
component. CWT provides the “high localization of time in high-frequency EEG signals” as well as a large 
number of waveforms apart from sinusoidal waveform [16]. The major drawback of CWT is the scaling 
value ‘a’ and translation value ‘b’ change continuously which yields a lot of unrelated information. This 
drawback can be overcome by DWT, which represents features at multiple levels [17]. DWT is used along
with the Daubechies 4th-order wavelet as the mother wavelet to decompose signals into approximation and detail
coefficients. These coefficients are obtained by recursively applying high-pass and low-pass filters
to get the frequency sub-bands (between 0 and 50 Hz). The Daubechies 4th-order wavelet (db4) is widely used
because it resembles EEG waveforms. DT-CWT is similar to DWT but has better approximate shift invariance
and anti-aliasing than DWT [12]. Wavelet decomposition (WD) decomposes the raw signals only into lower
frequency sub-bands, but high frequencies are also detected while performing mental tasks. Another drawback is
the deterioration of feature quality due to the quick reduction of wavelet coefficients. To overcome this issue,
wavelet packet decomposition (WPD) is used to decompose the raw signals into both lower and higher
frequency sub-bands [18].

Mini et al. [19] adopted DWT, WPD, and DWPD which is a combination of the DWT and WPD 
wavelet decomposition techniques where DWT was applied to detailed coefficient and WPD was applied to 
the approximate coefficient for the further decomposition of signals. WPD gives high accuracy when 
compared to other methods. Wavelet packet node reconstruction (WPNR) and wavelet node reconstruction 
(WNR) are responsible for reconstructing signals from their respective nodes [18]. Shen et al. [15] proposed 
a method that utilizes hybrid EEG features for the identification of DRDS tasks. Features are extracted using 
the one-vs-one method from the particular channels that had been selected using CSG techniques. The 
extracted features are time-frequency domain features. Chatterjee et al. [9] proposed a method where features 
are extracted based on wavelets and power. The feature extraction techniques are wavelet-based
energy-entropy, wavelet-based root mean square, PSD-based band power, PSD-based average power, and their
combinations. It is concluded that wavelet-based features such as wavelet-based energy-entropy and
wavelet-based root mean square lead to better performance than power-based features with classifiers such as
logistic regression and SVM. Chatterjee and Bandyopadhyay [20] concluded that wavelet-based energy-entropy as a
feature gives better accuracy when compared to statistical features and power features. Murugappan et al.
[21] proposed certain energy-based features such as recursing energy efficiency (REE), logarithmic REE
(LREE), and absolute logarithmic REE (ALREE) and classified these features using two classifiers,
KNN and LDA. Here KNN performs better, with a maximum accuracy of 83.26% with ALREE
features. Hence it is concluded that energy features perform better than power and conventional features [22].
Harpale and Bairagi [13] suggested that wavelet-based analysis gives better accuracy for feature extraction 
techniques. Wavelet-based decomposition and features are considered as better than FFT and STFT because 
wavelet decomposition separates the signal into detailed and approximation coefficients iteratively where we 
can get improved details of signal and better time-frequency representation while the latter seems to give the 
least time/frequency information and least information about signals [22]. 
2.4. Hybrid features 

Wei et al. [12] proposed a method in which time-domain, frequency-domain, and non-linear analysis 
features are extracted and used. In this method, raw EEG signals are preprocessed by filtering and 
decomposition into sub-bands using DT-CWT. Then the time-domain features are extracted using MAV, 
frequency-domain features by PSD, and non-linear analysis by fractional dimension (FD) and differential 
entropy (DE). These four features, along with the best two frequency bands, are given as input to a
simple recurrent unit, and ensemble methods such as voting and weighted averaging are then used to
accomplish the classification task [12]. Suleiman and Fatehi [10] proposed that time-frequency-space
analysis performed better than time/frequency domain and time-frequency domain. In multichannel EEGs, 
space-time-frequency (STF) is used for selecting signals from the appropriate regions or channels. This can 
be done by applying STFT on multiple electrodes to choose a channel. The selected channel is then combined 
with one of the channels and is sent as an input to the MLP which uses the back propagation algorithm. But 
in this method, no specific method was mentioned to select the best combination of channels which is 
necessary for extracting STF features [10]. Bajaj et al. [23] utilized the wavelet transform with statistical
features. The raw EEG signals are decomposed into high-pass and low-pass sub-bands using TQWT, and features
are extracted from these sub-bands using statistical feature extraction methods like Hjorth mobility (HM),
minima, maxima, mean, and standard deviation. Bandil and Wadhwan [24] proposed a method for epileptic
classification in which DWT is used to decompose the EEG signals into 5 sub-bands with db4 mother 
wavelet. Then the signals are standardized to reduce the impact of higher estimated factors over the lesser 
ones. Morphological features like AR coefficient, and PSD, and statistical features like mean, median, mode, 
and entropy features are extracted. Harpale and Bairagi [13] proposed a method that classifies seizure and 
non-seizure EEG signals using hybrid features. Features are extracted from both time and frequency domains 
such as mean, coefficient of variation (COV), root mean square (RMS), kurtosis, and PSD respectively. By 
applying pattern adapted wavelet transform, features like mean, RMS, PSD, and standard deviation are 
extracted. Liu et al. [25] proposed a method in which features are extracted from the time domain, frequency 
domain, time-frequency domain, and multi-electrodes. Relevant features from all of these domains are 
selected based on maximum relevance and minimum redundancy as a feature selection method. Features are 
also extracted from the appropriate combination of channels that leads to better accuracy. Multi electrode 
features focus on extracting features based on the interconnections between electrodes that are attached to 
different brain regions [25]. Garg and Verma [8] proposed wavelet-based feature extraction techniques for 
classifying scalograms using neural networks. CWT is used to decompose signals into scalograms for better 
time-frequency representation of signals. Then scalogram images were fed into a convolutional neural 
network (CNN) where the spatial feature i.e., power of each frequency band in the scalogram images, is 
extracted in the pooling layer [16]. Various feature extraction techniques are summarized in Table 1. Based
on the studies discussed above, hybrid features are considered to improve accuracy when compared to
using a single category of features at a time.


Table 1. Various feature extraction techniques

| Author | Preprocessing | Domain | Features | Feature extraction techniques | Classification algorithms | Performance |
|---|---|---|---|---|---|---|
| [15] | Bandpass filter, ICA | Time-frequency domain | Wavelet | CSG and OVO | SVM with RBF kernel | Accuracies for five subjects: 94.67%, 91.33%, 0.00%, 87.67%, 73.83% |
| [11] | High-pass filter, noise removal | Frequency domain | Power | CCA and PSD | Voting mechanism | Accuracy exceeds 72.84% |
| [12] | DT-CWT | Time-domain, frequency domain, non-linear analysis | Hybrid | MAV, PSD, FD and DE | Simple recurrent units and ensemble methods such as voting and weighted average | MAV: 79.22%, PSD: 78.29%, FD: 77.22%, DE: 80.02% |
| [6] | Bandpass filter, fourth-order Butterworth filter | Frequency-domain | | CSP | sLDA | Accuracy: 75.6% |
| [9] | Elliptic bandpass filter | Time-frequency domain | Wavelet | Wavelet-based energy-entropy, wavelet-based root mean square | Logistic; SVM; MLP | Logistic: ROC 0.918, recall 0.821, precision 0.821, accuracy 82.14; SVM: ROC 0.850, recall 0.850, precision 0.852, accuracy 85; MLP: ROC 0.917, recall 0.836, precision 0.839, accuracy 83.57 |
| [10] | Notch filters, high and low pass filters | Space-time-frequency domain and time-frequency domain | Hybrid | FFT, STFT | MLP | Classification accuracy: 99% (two tasks) and 96% (three tasks) |
| [23] | TQWT | Time-domain | Hybrid | Hjorth mobility, minima, maxima, mean and standard deviation | ELM | Accuracy: 91.842% |
| [18] | Notch filter | Time-frequency domain | Hybrid | Interchannel correlation coefficient and statistical features | SVM with polynomial kernel | Accuracy: 86% |
| [24] | DWT | Time-domain and frequency-domain | Hybrid | Morphological and statistical features | ANN | Accuracy: 99% |
| [13] | ICA | Time-domain, frequency-domain, and time-frequency domain | Hybrid | Standard deviation, variance, RMS, kurtosis, SUM, POW, and PSD | Fuzzy inference system | Accuracy: 96.48% |
| [7] | Time-window segmentation | Time-domain | Statistical, FD | Mean, SD, MAV | SVM with RBF kernel | Accuracy: 77.62%, 78.96%, 77.60% (valence, arousal, dominance) |
| [25] | High pass filter | Time-domain, frequency-domain, and time-frequency domain | Hybrid | Mean, SD, MAV, HOC, FD, Hjorth, NSI, PSD, REE, RMS, entropy, multi electrode features such as DA, RA, MSCE | Random forest | Accuracy: 71.23%, 69.9% (arousal and valence) |
| [14] | Bandpass filter | Frequency-domain | Hybrid | Logarithmic PSD | Neural network | Not mentioned |
| [8] | Bandpass filter | Time-frequency domain and spatial domain | Hybrid | Wavelet, CWT, spatial feature extraction | GoogleNet based CNN | Maximum accuracy of 92.19% |
| [22] | Average mean reference (AMR) | Time-frequency domain | Wavelet and entropy | DWT | FCM, FKM | Not mentioned |
| [21] | Surface Laplacian | Time-frequency domain | Wavelet and energy | DWT, ALREE | KNN | 83.26% |
| [5] | High pass filter, low pass filter and ICA | Frequency domain | Power | BED | SVM | 84.38% |
| [26] | High pass filter, low pass filter, and ICA | Time-domain, frequency-domain, and time-frequency domain | Hybrid | Statistical features, FD, Hjorth features, PSD, Coif1 wavelet, energy, and entropy | Unsupervised hyperplane partitioning | Maximum accuracy of 71.53% |
| [11] | Bandpass filter | Time-frequency domain | Wavelet | STFT | CNN | 90.59% |

3. METHODOLOGY 

In the current study, we have implemented machine learning techniques to classify cognitive-based 
stimuli (arithmetic mental tasks) using the EEG data. A total of 13 stimuli are used for each subject and EEG 
signals are recorded while performing the mental calculation. The overall framework of the study is depicted 
in Figure 1. Firstly, the signals are segmented into segments of 10 seconds. Secondly, the four categories of 
features are extracted from the time and frequency domain. Thirdly, the different combinations are made and 
given as input to the classifiers. Finally, the best combination of features along with the classifier is noted to 
classify the cognitive stimuli-based EEG signals. 


[Figure 1: flowchart with the stages data acquisition (raw EEG), preprocessing (signals segmented based on stimuli, each 10 sec), feature extraction for each channel and each EEG segment (statistical, power, entropy, dimensional), identification of the best combination of features along with the classifier, and output classification]

Figure 1. Framework of the study 


3.1. Data acquisition and pre-processing 

EEG is a non-invasive technology used to measure brain activity. In this study, a gTech recorder
with 16 electrodes was used to measure brain activity. The EEG signals were measured across
16 channels, namely FP2, F4, C4, P4, F8, T4, T6, O2, FP1, F3, C3, P3, F7, T3, T5, and O1, plus a reference (Ref). The
electrodes were placed according to the international standard 10-20 positioning system [10]. A sampling
frequency of 512 Hz was used. Sensitivity was set to 2.5 µV/mm. The low pass filter of 1.0 Hz was used to
remove high-frequency noise [7]. The notch filter was set to 50 Hz to remove power-line interference [7].

The subject is made to sit on a chair in a comfortable position. The electrodes were attached to the 
scalp by using a gel (Ten20 Conductive gel). Then tapes were attached to the electrodes to prevent them from 
moving. The reference electrode was placed on the right ear. The readings were taken from healthy
subjects (aged between 22 and 25). Simple mental arithmetic tasks (i.e., basic addition, subtraction,
multiplication, and division problems such as 10+5, 5-3, 5*5, and 6/2) were shown as the cognitive
stimuli, and 2 trials were taken for each subject. Each event was recorded with a time duration of 10 seconds.
The cognitive stimuli consisted of 13 mental arithmetic tasks, each lasting 10 seconds, with a time
break of 10 seconds. At the start of the experiment, there was a time break of 120 seconds. The raw EEG 
signals are processed in such a way that only the signals from the performance period are taken into account 
and the signals that are recorded during the resting state will be discarded. Therefore, the raw EEG data is 
segmented for every 10 seconds based on the target class which in this case is the stimuli. The feature 
extraction takes place on these segmented signals. 


3.2. Feature extraction 

The main aim of feature extraction is to obtain salient features from the EEG signals that could 
effectively classify the stimuli. In this study, four categories of features such as statistical, power, entropy, 
and FD features are extracted to analyze which feature set classifies most effectively. Each 10-second
trial is further segmented into 2-second pieces, and the following feature extraction techniques are applied.
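
The segmentation described in Sections 3.1 and 3.2 can be sketched as follows. This is a minimal illustration, not the authors’ code. It assumes that the recording starts with the 120-second break, that each 10-second task is immediately followed by its 10-second break, and that the 2-second pieces do not overlap; the function and constant names are hypothetical.

```python
FS = 512              # sampling frequency in Hz, as stated in Section 3.1
INITIAL_REST_S = 120  # break at the start of the experiment
TASK_S = 10           # duration of each arithmetic task
BREAK_S = 10          # break after each task (assumed to follow the task directly)
N_TASKS = 13          # number of stimuli per recording


def task_windows(fs=FS, n_tasks=N_TASKS):
    """Return (start, stop) sample indices of every task period."""
    windows = []
    for k in range(n_tasks):
        start_s = INITIAL_REST_S + k * (TASK_S + BREAK_S)
        windows.append((start_s * fs, (start_s + TASK_S) * fs))
    return windows


def extract_epochs(channel, fs=FS):
    """Keep only the task periods of one channel; resting periods are dropped."""
    return [channel[start:stop] for start, stop in task_windows(fs)]


def split_epoch(epoch, fs=FS, piece_s=2):
    """Split one task epoch into consecutive, non-overlapping pieces."""
    step = piece_s * fs
    return [epoch[i:i + step] for i in range(0, len(epoch) - step + 1, step)]
```

Under these assumptions, each 10-second epoch holds 5120 samples and yields five 2-second pieces of 1024 samples per channel.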


3.2.1. Statistical features 

In this study, the following statistical features are extracted: mean [7], standard deviation [7], mean absolute value [12], root
mean square [13], and coefficient of variation [13].
a. Mean

$$\mu_X = \frac{1}{N} \sum_{n=1}^{N} X(n) \tag{1}$$

where ‘$\mu_X$’ denotes the mean of the data, ‘X(n)’ denotes the data points and ‘N’ denotes the number of data
points.
b. Standard deviation

$$\sigma_X = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( X(n) - \mu_X \right)^2} \tag{2}$$

where ‘$\sigma_X$’ denotes the standard deviation of the data, ‘X(n)’ denotes the data points, ‘$\mu_X$’ denotes the mean of
the data points and ‘N’ denotes the number of data points.
c. Mean absolute value

MAV is calculated by taking the average of the absolute value of the data points; as written in (3), the logarithm of this average is taken.

$$M = \log\left( \frac{1}{N} \sum_{n=1}^{N} |x(n)| \right) \tag{3}$$

where ‘x(n)’ denotes the data points and ‘N’ denotes the number of data points.
d. Root mean square (RMS)

RMS is calculated by taking the square root of the averaged squared value of the data points [13].

$$RMS = \sqrt{\frac{1}{N} \sum_{n=1}^{N} x(n)^2} \tag{4}$$

e. Coefficient of variation (COV)

$$COV = \frac{\sigma_X}{\mu_X} \tag{5}$$

where ‘$\mu_X$’ denotes the mean of the data and ‘$\sigma_X$’ denotes the standard deviation of the data.
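
As an illustration of (1) to (5), the sketch below computes the statistical features of one segment with the Python standard library only. It assumes the population form of the standard deviation (division by N) and returns both the plain MAV and its logarithm, since (3) applies a logarithm; the function name and dictionary keys are hypothetical.

```python
import math


def statistical_features(x):
    """Statistical features of one EEG segment given as a list of floats."""
    n = len(x)
    mean = sum(x) / n
    # Population standard deviation (divides by N, an assumption of this sketch).
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    mav = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    return {
        "mean": mean,
        "std": std,
        "mav": mav,
        # Logarithm of the MAV, following (3); undefined for an all-zero segment.
        "log_mav": math.log(mav) if mav > 0 else float("nan"),
        "rms": rms,
        # COV is undefined when the mean is zero.
        "cov": std / mean if mean != 0 else float("nan"),
    }
```

Note that COV becomes unstable for segments whose mean is close to zero, which is common for filtered EEG, so it may need special handling in practice.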
3.2.2. Power features 

Power features include features extracted from the frequency domain of EEG signals. One of the
widely adopted techniques for extracting power features is PSD. PSD is calculated using Welch’s method by
averaging the periodograms of the segmented blocks of the original signal [11],

$$S_x = \frac{1}{K} \sum_{m=1}^{K} P_{xx}^{(m)} \tag{6}$$

where $P_{xx}^{(m)}$ is the periodogram of the m-th block, which is the result of the FFT applied over the segmented
signal, and ‘K’ represents the total number of segmented blocks in the original signal.
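
The averaging in (6) can be sketched as below. This simplified version uses non-overlapping blocks and a rectangular window, which is strictly the Bartlett special case of Welch’s method, and a direct DFT instead of an FFT so that it runs with the standard library alone. The function names are hypothetical and the block length is a free parameter.

```python
import cmath
import math


def periodogram(block, fs):
    """One-sided periodogram of one block via a direct DFT (O(N^2))."""
    n = len(block)
    psd = []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(block))
        power = abs(coeff) ** 2 / (fs * n)
        # Double every bin except DC and Nyquist to account for negative frequencies.
        if 0 < k < n / 2:
            power *= 2
        psd.append(power)
    return psd


def averaged_psd(x, fs, block_len):
    """Average the periodograms of K consecutive blocks, as in (6)."""
    blocks = [x[i:i + block_len]
              for i in range(0, len(x) - block_len + 1, block_len)]
    periodograms = [periodogram(b, fs) for b in blocks]
    k = len(periodograms)
    n_bins = block_len // 2 + 1
    psd = [sum(p[j] for p in periodograms) / k for j in range(n_bins)]
    freqs = [j * fs / block_len for j in range(n_bins)]
    return freqs, psd
```

For real recordings, an FFT-based routine with overlapping, tapered windows would normally be preferred; the sketch only shows the averaging step.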


3.2.3. Entropy features 

Entropy is recommended to extract non-linear features of EEG signals [7]. In this study, six 
categories of entropy features are extracted for the non-linear analysis of EEG data. 
a. Shannon entropy 

Shannon entropy is a measure of uncertainty present in the value. It quantifies the amount of 
information that a particular variable or data holds over the result [27]. It is defined as (7), 


$$H(X) = -\sum_{i=1}^{n} P_i \log_2 P_i \tag{7}$$


where ‘n’ denotes the number of data points and ‘$P_i$’ denotes the probability of a data point.
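
A minimal sketch of (7) is given below. The probabilities of a continuous-valued signal have to be estimated somehow; here they are estimated from an equal-width histogram, which is an assumption of this sketch (the paper does not state the estimator), and the number of bins is a hypothetical parameter.

```python
import math
from collections import Counter


def shannon_entropy(x, n_bins=16):
    """Shannon entropy (in bits) of a signal, using histogram probabilities."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 0.0  # a constant signal carries no uncertainty
    width = (hi - lo) / n_bins
    # Map each sample to a bin index; the maximum value goes into the last bin.
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in x)
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

The result depends on the bin count, so the same `n_bins` should be used for all segments that are compared.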
b. Spectral entropy 

Spectral entropy (SE) describes how the power spectrum of the EEG signal is distributed, i.e., whether it
is ‘flat’ or ‘peaked’ [28]. It is calculated by measuring Shannon’s entropy
of the PSD [7] by (8),

$$SE = -\sum_{f=0}^{f_s/2} PSD(f) \log_2\left( PSD(f) \right) \tag{8}$$

where the upper limit $f_s/2$ is half of the sampling frequency $f_s$ [7].
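
The following sketch illustrates (8). It assumes that the PSD is first normalized to sum to one so that it can be treated as a probability distribution, and it uses a plain DFT power spectrum rather than Welch’s estimate to stay self-contained; both choices, and the optional normalization by the maximum entropy, are assumptions of the sketch.

```python
import cmath
import math


def power_spectrum(x):
    """Squared DFT magnitude for frequencies from 0 up to half the sampling rate."""
    n = len(x)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x))) ** 2
            for k in range(n // 2 + 1)]


def spectral_entropy(x, normalize=False):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    spectrum = power_spectrum(x)
    total = sum(spectrum)
    if total == 0:
        return 0.0
    probs = [p / total for p in spectrum]
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    if normalize:
        h /= math.log2(len(probs))  # scale to the range [0, 1]
    return h
```

A pure sinusoid gives a value near zero (a single peak), while an impulse, whose spectrum is flat, gives the maximum value.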
c. Permutation entropy 

Permutation entropy (PE) quantifies the information by analyzing the patterns of ranks of values 
present in the time series data [29]. It is defined by (9), 


$$PE = -\sum_{i} p'_i \log_2\left( p'_i \right) \tag{9}$$


where $p'_i$ denotes the relative frequency with which the i-th ordinal pattern occurs in the time series.
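
Permutation entropy as in (9) can be sketched as follows. The embedding order of 3 and delay of 1 are assumed defaults, not values taken from the paper, and the function name is hypothetical.

```python
import math
from collections import Counter


def permutation_entropy(x, order=3, delay=1, normalize=False):
    """Permutation entropy (in bits) from the ordinal patterns of a time series."""
    n_vectors = len(x) - (order - 1) * delay
    if n_vectors <= 0:
        raise ValueError("time series too short for the given order and delay")
    patterns = Counter()
    for i in range(n_vectors):
        window = [x[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    h = -sum((c / n_vectors) * math.log2(c / n_vectors)
             for c in patterns.values())
    if normalize:
        h /= math.log2(math.factorial(order))
    return h
```

For the short series 4, 7, 9, 10, 6, 11, 3 with order 3, three distinct patterns occur with relative frequencies 2/5, 2/5, and 1/5, which gives about 1.52 bits.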
d. Singular value decomposition entropy (SVDE) 

SVDE measures the dimensionality of the EEG data by analyzing the number of eigenvectors needed to
represent the data [7]. It is defined by (10),

$$SVDE = -\sum_{i=1}^{n} \sigma_i \log_2 \sigma_i \tag{10}$$

where ‘$\sigma_i$’ denotes the singular values of the embedding-space matrix of delayed vectors (also known as the singular
spectrum) of the input EEG data and ‘n’ denotes the number of singular values.
e. Approximate entropy and sample entropy 

Approximate entropy (ApEn) measures the degree of irregularity present in the data [30]. According
to Steve Pincus, ApEn is defined as the “likelihood that runs of patterns that are close remain close on next
incremental comparisons” [31]. The study demonstrates that ApEn performs well with relatively small
time-series data. Sample entropy is used to examine the sequence and regularity present in the data and assigns a
non-negative number to the sequence in such a way that a larger value denotes more irregularity present in
the data [30]. Sample entropy can be defined as (11),

$$SampEn(m, r) = \lim_{N \to \infty} \left\{ -\ln\left[ \frac{A^m(r)}{B^m(r)} \right] \right\} \tag{11}$$

where ‘m’ denotes the run length of data points, ‘r’ denotes the tolerance window, ‘N’ denotes the number of data
points, $A^m(r)$ denotes the probability that two sequences match for m+1 points, and $B^m(r)$ denotes the
probability that two sequences match for m points.
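
For a finite segment, (11) is estimated by counting template matches. The sketch below is a straightforward O(N²) illustration; the defaults m = 2 and r = 0.2 times the standard deviation are common conventions and are assumed here rather than taken from the paper, as is the use of the Chebyshev distance with self-matches excluded.

```python
import math


def sample_entropy(x, m=2, r=None):
    """Sample entropy of a time series; r defaults to 0.2 * standard deviation."""
    n = len(x)
    if r is None:
        mean = sum(x) / n
        r = 0.2 * math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def count_matches(length):
        # The same N - m starting points are used for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # j > i excludes self-matches
                dist = max(abs(a - b)
                           for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # pairs matching for m points
    a = count_matches(m + 1)  # pairs matching for m + 1 points
    if a == 0 or b == 0:
        return float("inf")   # no matches: entropy is undefined or unbounded
    return -math.log(a / b)
```

A strictly periodic signal gives a value of zero, while irregular signals give larger values.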
3.2.4. Fractional dimension features 

FD features are another non-linear analysis technique used for analyzing EEG signals. It is used to 
measure the FD of a geometric object [32]. 
a. Katz’s FD 

Katz’s FD is calculated by deriving the FD directly from the planar waveform [7], [32].
Katz’s FD is calculated as (12) [33],

$$FD = \frac{\log(N)}{\log(N) + \log\left( \frac{d}{L} \right)} \tag{12}$$

where ‘d’ denotes the diameter of the waveform and ‘L’ is the length of the waveform.
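
A minimal sketch of (12) follows. It makes two assumptions that the text does not spell out: N is taken as the number of steps between successive samples, and distances are measured along the amplitude axis only, a common simplification for uniformly sampled signals.

```python
import math


def katz_fd(x):
    """Katz fractal dimension of a uniformly sampled waveform."""
    steps = [abs(b - a) for a, b in zip(x, x[1:])]
    length = sum(steps)                        # L: total length of the waveform
    diameter = max(abs(v - x[0]) for v in x)   # d: largest distance from the first sample
    n = len(steps)                             # N: number of steps (assumption)
    if length == 0 or diameter == 0:
        raise ValueError("Katz FD is undefined for a flat waveform")
    return math.log10(n) / (math.log10(n) + math.log10(diameter / length))
```

A monotonic ramp has d equal to L and therefore an FD of exactly 1; more irregular waveforms have d much smaller than L and a larger FD.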
b. Petrosian FD 

Petrosian FD is computed by applying Katz’s FD over the binary sequences of the time-series data
[7], [32]. The Petrosian FD is calculated as (13) [33],

$$FD = \frac{\log(N)}{\log(N) + \log\left( \frac{N}{N + 0.4 N_{\delta}} \right)} \tag{13}$$

where ‘$N_{\delta}$’ is the number of unique segment pairs present in the binary sequence.
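
The sketch below illustrates (13). It assumes the most common binarization, in which the binary sequence is the sign of the first difference of the signal and the count in the denominator is the number of sign changes in that sequence; the paper does not state which binarization was used.

```python
import math


def petrosian_fd(x):
    """Petrosian fractal dimension using sign changes of the first difference."""
    n = len(x)
    diff = [b - a for a, b in zip(x, x[1:])]
    # Count positions where consecutive differences have opposite signs.
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))
```

A monotonic signal has no sign changes and an FD of 1, while a signal that alternates on every sample gives a value only slightly above 1, which reflects the narrow range of this measure.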
c. Higuchi’s FD 
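Let X(1), X(2), ..., X(N) be the time-series data points, from which new series are constructed as (14),

$$X_k^m : X(m), X(m+k), \ldots, X\left( m + \left\lfloor \frac{N-m}{k} \right\rfloor k \right) \tag{14}$$

where m=1,2, ..., k and ‘m’ denotes the starting point. ‘k’ denotes the interval between data points. For each
‘k’, calculate the length of the curve by (15).

$$L_m(k) = \frac{1}{k} \left[ \left( \sum_{i=1}^{\left\lfloor \frac{N-m}{k} \right\rfloor} \left| X(m+ik) - X(m+(i-1)k) \right| \right) \frac{N-1}{\left\lfloor \frac{N-m}{k} \right\rfloor k} \right] \tag{15}$$

where ‘$L_m(k)$’ denotes the length of the curve. Then Higuchi’s FD is calculated by applying (16),

$$FD = -\lim_{k \to \infty} \frac{\log L(k)}{\log k} \tag{16}$$

where L(k) is the average of $L_m(k)$ over the k starting points.
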
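
In practice, Higuchi’s FD is estimated as the slope of a least-squares line fitted to log L(k) against log(1/k) over a finite range of k. The sketch below follows (14) to (16) in that way; the maximum interval `k_max = 8` is an assumed default, and the series must be longer than `k_max` and not constant.

```python
import math


def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension as the slope of log L(k) versus log(1/k)."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):           # starting points, 1-based as in (14)
            n_steps = (n - m) // k
            if n_steps < 1:
                continue
            total = sum(abs(x[m - 1 + i * k] - x[m - 1 + (i - 1) * k])
                        for i in range(1, n_steps + 1))
            # Curve length L_m(k) with the normalization factor of (15).
            lengths.append(total * (n - 1) / (n_steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(sum(lengths) / len(lengths)))
    # Ordinary least-squares slope.
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_len) / len(log_len)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den
```

A straight line yields an FD of 1, and white noise yields a value close to 2, which matches the expected range for a planar curve.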


4. RESULTS AND ANALYSIS 


To find out which features perform best in classifying cognitive stimuli from EEG signals, we
investigated individual feature categories and combinations of categories along with the respective classifiers. We used
two methods to compare the feature extraction techniques. Based on the comparison results,
we provide our evaluation of the best set of features that could be used in the classification of cognitive
stimuli-based EEG signals. In this study, the extracted features are given as input to the classifiers (RFC,
KNN, MLP, LDA, and SVM) and the accuracies are noted. Time and frequency domain features are then
analyzed in two manners.

a. Analysis of individual domain features: Combine each feature with respect to their categories and 
compare the accuracies of each category 

b. Analysis of hybrid domain features: Combine each category of features in multiple combinations and 
compare the accuracies of each combination of feature categories. 
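
How a feature set is turned into an accuracy figure can be illustrated with the KNN classifier. The sketch below is a plain standard-library implementation, not the authors’ setup: the number of neighbors (k = 3), the Euclidean distance, and the use of a separate hold-out test set are all assumptions, since the paper does not state them here.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of one feature vector by majority vote of its k nearest neighbors."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Percentage of test vectors whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, q, k) == y
               for q, y in zip(test_x, test_y))
    return 100.0 * hits / len(test_y)
```

Because KNN relies on distances, features with very different scales (for example power values and entropies) would normally be standardized before being combined.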


4.1. Analysis of individual domain features 

The individual features extracted from the time-series data are grouped so that each feature belongs 
to one of four categories: statistical, power, entropy, and FD features. Each category is then given as input 
to the classifiers and the accuracies are noted. Table 2 lists the accuracy of each feature category with all 
five classifiers, and the bar plot in Figure 2 illustrates the same results. The FD features achieve the highest 
accuracy, 82.53%, with the LDA classifier. 
Table 2. Performance analysis on individual domain features based on the accuracy 


Features               RFC     LDA     SVM     MLP     KNN 
Statistical features   63.5%   41.3%   69.8%   38.1%   50.8% 
Power features         61.9%   17.5%   65.1%   38.1%   66.7% 
Entropy features       79.4%   54.0%   74.6%   79.4%   71.8% 
FD features            76.2%   82.5%   66.7%   47.6%   74.6% 
Figure 2. Performance analysis on individual domain features based on the accuracy 
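
To make the evaluation step concrete, the sketch below shows how a single accuracy figure of the kind reported in Table 2 could be computed for one feature category and one classifier. It is only an illustration under stated assumptions: the classifier is a small hand-written KNN with Euclidean distance, `k=3` is an assumed setting, and the feature values and labels are made-up toy numbers, not data from this study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Percentage of test samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical two-dimensional feature vectors (toy values) with class labels 0 and 1.
train_x = [[0.1, 1.1], [0.2, 1.0], [0.0, 1.2], [0.9, 1.8], [1.0, 1.9], [1.1, 1.7]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[0.15, 1.05], [0.95, 1.85], [0.05, 1.15], [1.05, 1.75]]
test_y = [0, 1, 0, 1]

predictions = [knn_predict(train_x, train_y, q) for q in test_x]
print(accuracy(test_y, predictions))  # 100.0 on this well-separated toy data
```

Repeating the same train-and-score step for every feature category and every classifier fills one table of accuracies, with one row per category and one column per classifier.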


4.2. Analysis of hybrid domain features 

The four feature categories are combined in every possible way (six pairs, four triples, and the full 
set of four), which gives a total of 11 unique combinations, where ‘S’ represents statistical features, ‘P’ 
represents power features, ‘E’ represents entropy features, and ‘F’ represents FD features. Each combination 
is given as input to the classifiers and the accuracies are noted. Table 3 lists the accuracy of every hybrid 
combination with all five classifiers, and the bar plot in Figure 3 illustrates the same results. The 
combination of entropy-FD features gives the highest accuracy of 90.47% with the KNN classifier; the 
entropy-power-FD combination reaches the same value. 
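
The count of 11 combinations can be checked with a short sketch. The code below enumerates every combination of two or more categories and shows one plausible way to build a hybrid feature vector, namely by concatenating the per-category features. Concatenation, the helper names, and the example values are assumptions for illustration; the text does not state how the categories are merged.

```python
from itertools import combinations

CATEGORIES = ["S", "P", "E", "F"]  # statistical, power, entropy, FD


def hybrid_combinations(categories):
    """Return every combination of two or more feature categories."""
    combos = []
    for size in range(2, len(categories) + 1):
        combos.extend(combinations(categories, size))
    return combos


def build_hybrid_vector(features_by_category, combo):
    """Concatenate the feature lists of the chosen categories (assumed merge rule)."""
    vector = []
    for category in combo:
        vector.extend(features_by_category[category])
    return vector


combos = hybrid_combinations(CATEGORIES)
labels = ["-".join(c) for c in combos]
print(len(combos))  # 11 = 6 pairs + 4 triples + 1 set of four

# Hypothetical per-trial features for one sample (made-up values).
sample = {"S": [0.2, 1.3], "P": [4.1], "E": [0.71, 0.64], "F": [1.52]}
print(build_hybrid_vector(sample, ("E", "F")))  # [0.71, 0.64, 1.52]
```

The generated labels follow the order of `CATEGORIES`, so a combination written as S-E-P in Table 3 appears here as S-P-E; the set of categories is the same.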


Table 3. Performance analysis on hybrid domain features based on the accuracy 


Features   RFC     LDA     SVM     MLP     KNN 
S-P        68.2%   46.0%   61.9%   46.0%   50.8% 
S-F        87.3%   28.6%   69.8%   52.4%   63.5% 
S-E        84.1%   19.0%   69.8%   52.4%   74.6% 
P-F        82.5%   87.3%   69.8%   46.0%   73.0% 
P-E        79.4%   34.9%   69.8%   50.8%   77.8% 
E-F        74.6%   69.8%   73.0%   81.0%   90.5% 

S-P-F      82.5%   30.2%   61.9%   46.0%   65.1% 
S-E-F      81.0%   27.0%   69.8%   47.6%   82.5% 
S-E-P      82.5%   19.0%   61.9%   50.8%   76.2% 
E-P-F      73.0%   47.6%   73.0%   61.9%   90.5% 

S-E-P-F    81.0%   31.8%   63.5%   50.8%   82.5% 
Figure 3. Performance analysis on hybrid domain features based on the accuracy 


5. DISCUSSIONS 

The results of the current study demonstrate that hybrid domain feature analysis gives the highest 
accuracy, 90.47%, which is obtained with KNN on both the entropy-FD and the entropy-power-FD feature 
combinations. This outperforms individual domain feature analysis, in which entropy features achieve 
79.36% with RFC and MLP, and FD features achieve 82.53% with LDA. Dutta et al. [26] proposed a feature 
extraction technique for mental task-based EEG signal classification that combines multivariate empirical 
mode decomposition (MEMD) and phase-based decomposition. The features extracted in this way are in the 
time domain, which makes the method easy to implement in real-time applications. An LS-SVM classifier is 
used to classify the extracted features and achieves a highest accuracy of 83.33%. Our proposed approach 
outperforms this method with a highest accuracy of 90.47%. The hybrid entropy-FD features are sufficient 
for efficient classification, since adding power features to the entropy-FD features has no impact on the 
classification accuracy: the result is the same as for the entropy-FD combination alone. Hence, we conclude 
that the entropy-FD features along with the KNN classifier can effectively be used in the classification of 
cognitive stimuli-based EEG signals. 


6. CONCLUSION 

A brain-computer interface is a way of communicating between the brain and an external device, both 
sharing the same interface, which can be controlled externally. The main objective of this work is to enhance 
classification accuracy using hybrid features. In BCI, it is suggested that using features from multiple 
domains improves classification accuracy. There are various methods for extracting features from raw EEG 
signals, such as statistical approaches, power features, and wavelet features. In this study, statistical 
features (mean, standard deviation, MAV, RMS, and COV), power features (such as PSD), FD features, and 
entropy features are extracted from the raw EEG signals. Two methods of combination are employed to find 
the best combination of features along with its classifier. The experiment shows that, among the individual 
feature categories, FD features with the LDA classifier give the best accuracy of 82.53%, and that, among 
the hybrid combinations, entropy-FD features with the KNN classifier give the highest accuracy of 90.47%. 
These results suggest that the combination of entropy and FD features with the KNN classifier can be used 
to effectively classify the target class, and hence to predict the stimuli class of the tasks performed by the 
subjects from their brain signals. Hybrid feature sets might, however, increase the computational 
complexity of the system as the amount of data increases. Our work can be further extended by applying 
feature optimization and channel selection techniques to minimize the complexity of the existing model. 